Recent Developments in Multilayer Perceptron Neural Networks
Authors
Abstract
Several neural network architectures have been developed over the past several years. One of the most popular and most powerful architectures is the multilayer perceptron (MLP). This architecture is described in detail, and recent advances in training the multilayer perceptron are presented. Multilayer perceptrons are trained using various techniques. For years the most widely used training method was backpropagation, along with various derivatives of it that incorporate gradient information. Recent developments have used output weight optimization-hidden weight optimization (OWO-HWO) and full conjugate gradient methods. OWO-HWO is a very powerful technique in terms of accuracy and rapid convergence. OWO-HWO has been used with a unique “network growing” technique to ensure that the mean square error is monotonically non-increasing as the network size (i.e., the number of hidden layer nodes) increases. This “network growing” technique was trained using OWO-HWO but is amenable to any training technique. It significantly improves the training and testing performance of the MLP.

1.0 Introduction

Several neural networks have been developed and analyzed over the last few decades. These include self-organizing neural networks [1, 2], the Hopfield network [3, 4], radial basis function networks [5, 6], the Boltzmann machine [7], the mean-field theory machine [8] and multilayer perceptrons (MLPs) [9-17]. MLPs have evolved over the years into a very powerful technique for solving a wide variety of problems. Much progress has been made in improving performance and in understanding how these neural networks operate. However, the need for additional improvements in training these networks still exists, since the training process is very chaotic in nature.

Proceedings of the 7th Annual Memphis Area Engineering and Science Conference, MAESC 2005

2.0 Structure and Operation of Multilayer Perceptron Neural Networks

MLP neural networks consist of units arranged in layers.
Each layer is composed of nodes, and in the fully connected networks considered in this paper each node connects to every node in subsequent layers. Each MLP is composed of a minimum of three layers: an input layer, one or more hidden layers, and an output layer. This definition ignores the degenerate linear multilayer perceptron consisting of only an input layer and an output layer. The input layer distributes the inputs to subsequent layers. Input nodes have linear activation functions and no thresholds. Each hidden unit node and each output node has a threshold associated with it in addition to the weights. The hidden unit nodes have nonlinear activation functions and the output nodes have linear activation functions. Hence, each node in a subsequent layer multiplies each signal feeding into it by a weight, adds its threshold to the weighted sum, and passes the result through an activation function that is nonlinear for hidden units and linear for output units. A typical three-layer network is shown in figure 1. Only three-layer MLPs are considered in this paper, since these networks have been shown to approximate any continuous function [18-20]. In the actual three-layer MLP, all of the inputs are also connected directly to all of the outputs. These connections are not shown in figure 1 to simplify the diagram.

Figure 1. Typical three-layer multilayer perceptron neural network.
[Figure labels: Input Layer, Hidden Layer, Output Layer]

The training data consists of a set of N_v training patterns (x_p, t_p), where p represents the pattern number. In figure 1, x_p corresponds to the N-dimensional input ...
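
The forward computation described in this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation. The names net, o, y and w_oh follow the labels in figure 1. The names w_hi, w_oi, theta_h and theta_o are hypothetical, and the logistic sigmoid is an assumed choice, since the text says only that the hidden activation is nonlinear.

```python
import math


def sigmoid(net):
    """Logistic activation (an assumed choice for the nonlinear hidden units)."""
    return 1.0 / (1.0 + math.exp(-net))


def mlp_forward(x, w_hi, theta_h, w_oh, w_oi, theta_o):
    """Forward pass of a fully connected three-layer MLP for one pattern.

    x       : list of N inputs
    w_hi    : Nh x N weights, input -> hidden        (hypothetical name)
    theta_h : Nh hidden thresholds                   (hypothetical name)
    w_oh    : M x Nh weights, hidden -> output
    w_oi    : M x N bypass weights, input -> output  (hypothetical name)
    theta_o : M output thresholds                    (hypothetical name)
    """
    # Hidden net function: weighted sum of the inputs plus the threshold.
    net = [sum(w * xi for w, xi in zip(row, x)) + th
           for row, th in zip(w_hi, theta_h)]
    # Hidden units apply a nonlinear activation.
    o = [sigmoid(n) for n in net]
    # Output units are linear: hidden contribution, direct input-to-output
    # (bypass) contribution, and the output threshold.
    y = [sum(w * oj for w, oj in zip(row_h, o))
         + sum(w * xi for w, xi in zip(row_i, x)) + th
         for row_h, row_i, th in zip(w_oh, w_oi, theta_o)]
    return net, o, y


# Tiny example: N = 2 inputs, one hidden unit, one output.
net, o, y = mlp_forward(
    x=[1.0, 2.0],
    w_hi=[[0.0, 0.0]], theta_h=[0.0],
    w_oh=[[2.0]], w_oi=[[1.0, -1.0]], theta_o=[0.5],
)
print(y)  # prints [0.5]
```

In the example the hidden net function is 0, so the hidden output is 0.5. The output is then 2(0.5) + (1 - 2) + 0.5 = 0.5, which shows the bypass connections and the output threshold contributing alongside the hidden unit.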
Similar resources
Application of multilayer perceptron neural network and support vector machine for modeling the hydrodynamic behavior of permeable breakwaters with porous core
In this research, the application of multilayer perceptron (MLP) neural networks and support vector machines (SVM) to modeling the hydrodynamic behavior of permeable breakwaters with a porous core has been investigated. For this purpose, experimental data from a physical model have been used to relate the reflection and transition coefficients of incident waves as the output parameters to the wid...
Hourly Wind Speed Prediction using ARMA Model and Artificial Neural Networks
In this paper, a comparison study is presented on artificial intelligence and time series models in 1-hour-ahead wind speed forecasting. Three types of typical neural networks, namely adaptive linear element, multilayer perceptrons, and radial basis function, and ARMA time series model are investigated. The wind speed data used are the hourly mean wind speed data collected at Binalood site in I...
Neural Nets via Forward State Transformation and Backward Loss Transformation
This article studies (multilayer perceptron) neural networks with an emphasis on the transformations involved — both forward and backward — in order to develop a semantical/logical perspective that is in line with standard program semantics. The common two-pass neural network training algorithms make this viewpoint particularly fitting. In the forward direction, neural networks act as state tra...
Understanding Convolutional Neural Networks
This seminar paper focusses on convolutional neural networks and a visualization technique allowing further insights into their internal operation. After giving a brief introduction to neural networks and the multilayer perceptron, we review both supervised and unsupervised training of neural networks in detail. In addition, we discuss several approaches to regularization. The second section in...
Stock market prediction using different neural network classification architectures
In recent years, many attempts have been made to predict the behavior of bonds, currencies, stocks, or stock markets. In this paper, the Standard & Poor's 500 Index is modeled using different neural network classification architectures. Most previous experiments used multilayer perceptrons for stock market forecasting. In this paper, a multilayer perceptron architecture and a probabilistic neura...